Deep learning-based diagnosis from endobronchial ultrasonography images of pulmonary lesions

Endobronchial ultrasonography with a guide sheath (EBUS-GS) improves the accuracy of bronchoscopy. The ability to differentiate benign from malignant lesions based on EBUS findings may be useful in making the correct diagnosis. We investigated whether a convolutional neural network (CNN) model could predict benign or malignant (lung cancer) lesions based on EBUS findings. This was an observational, single-center cohort study. Using medical records, patients were divided into benign and malignant groups. We acquired EBUS data for 213 participants. A total of 2,421,360 images were extracted from the learning dataset. We trained and externally validated a CNN algorithm to predict benign or malignant lung lesions. Testing was performed using 26,674 images. The test dataset was also interpreted by four bronchoscopists. The accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the CNN model for distinguishing benign and malignant lesions were 83.4%, 95.3%, 53.6%, 83.8%, and 82.0%, respectively. For the four bronchoscopists, the accuracy was 68.4%, sensitivity was 80%, specificity was 39.6%, PPV was 76.8%, and NPV was 44.2%. The developed EBUS-computer-aided diagnosis system is expected to read EBUS findings that are difficult for clinicians to judge with precision and to help differentiate between benign lesions and lung cancers.


Methods
Patient population. For this observational retrospective cohort study, we obtained EBUS images of peripheral pulmonary lesions that were recorded between April 2017 and November 2019 at our institute. Bronchoscopy was performed in midazolam-sedated patients using a flexible bronchoscope (BF-P260F, BF-260, BF-6C260, or BF-1T260; Olympus Medical Systems, Tokyo, Japan). EBUS images were obtained using a miniature ultrasound probe (UM-S20-17S, UM-S20-20R, Olympus Medical Systems) and endoscopic ultrasound processors (Endoscopic Ultrasound Center; EU-ME1, Olympus Medical Systems). The inclusion criteria for EBUS images of malignancy were histopathologically confirmed cases of lung adenocarcinoma, squamous cell lung cancer, and small cell lung cancer, diagnosed either by surgery or bronchoscopic biopsy. The inclusion criteria for EBUS images of benign lesions were bacteriological diagnosis of histopathologically confirmed cases or the disappearance of lung shadows for a minimum of 6 months of follow-up. EBUS images of poor quality were excluded from the study due to the unclear depiction of lesions. Tumor lesions were visible on all images, and multiple images were collected for the same lesion to include different distances and angles. Lesions were selected by an experienced bronchoscopist (bronchoscopy specialist, 11 years of experience in bronchoscopy, and research experience related to EBUS imaging) to generate image datasets for deep learning models. This retrospective study was approved by the Shimane University Institutional Review Board (IRB study number: 5073), which also approved the waiver of the requirement for informed consent owing to the retrospective nature of the study. This study was conducted in accordance with the amended Declaration of Helsinki.
Data preprocessing. Data augmentation was used to increase the variation in the images. Previous reports have shown that these techniques are effective in improving the accuracy of recognition and classification in the analysis of endoscopic ultrasonography images 8 . The augmentation methods used were rotation, inversion, and enlargement. Data augmentation was applied to the training images.
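The three augmentation operations named above can be illustrated with a minimal Python sketch. The paper does not report rotation angles, flip axes, or zoom factors, so the choices here (a 90-degree clockwise rotation, a left-right flip as the "inversion", and a 2x central zoom with nearest-neighbour resampling) are assumptions made only for illustration, as are all function names.

```python
from typing import List

Image = List[List[int]]  # grayscale image as rows of pixel intensities


def rotate90(img: Image) -> Image:
    """Rotate the image 90 degrees clockwise (assumed angle)."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img: Image) -> Image:
    """Mirror the image left to right (one possible reading of 'inversion')."""
    return [row[::-1] for row in img]


def enlarge(img: Image, factor: int = 2) -> Image:
    """Zoom into the image centre by `factor`, keeping the original size.

    The central region is cropped and upsampled with nearest-neighbour
    interpolation. The factor of 2 is an assumption, not a reported value.
    """
    h, w = len(img), len(img[0])
    ch, cw = max(1, h // factor), max(1, w // factor)
    top, left = (h - ch) // 2, (w - cw) // 2
    return [
        [img[top + (r * ch) // h][left + (c * cw) // w] for c in range(w)]
        for r in range(h)
    ]


def augment(img: Image) -> List[Image]:
    """Return the original image plus one variant per augmentation."""
    return [img, rotate90(img), flip_horizontal(img), enlarge(img)]
```

In a sketch like this, each training image yields several variants while its benign or malignant label is unchanged; test images would be left unaugmented.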
Model development. Our convolutional neural network (CNN) structure is shown in Fig. 1. First, the EBUS image was input to the feature extraction CNN. In the first block of the CNN, local features, such as edges and textures, were extracted from the input image. As the image passed through the network, these features were integrated. Finally, they were converted into features that were useful for discriminating between benign and malignant lesions. Next, these useful features were input into the classification neural network. In neural network classifica-
Outcome measures. The entire dataset was divided into training data and test data to check the accuracy of the model. Using hold-out validation, the images were divided into training (80% of patients) and test sets (20% of patients) (Fig. 2). The most recently acquired data were used as the test set. The classification provided by the CNN-computer-aided diagnosis (CAD) system was compared with the histopathology results. Accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were used as evaluation indexes.
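The hold-out procedure described above (a split by patient rather than by image, with the most recent examinations reserved for testing) could be sketched as follows. This is an illustrative sketch only: the function names, the patient identifiers, and the rounding rule for the test-set size are hypothetical and are not taken from the study.

```python
from datetime import date
from typing import Dict, List, Tuple


def chronological_patient_split(
    exam_dates: Dict[str, date], test_fraction: float = 0.2
) -> Tuple[List[str], List[str]]:
    """Split patient IDs so that the most recent examinations form the test set."""
    ordered = sorted(exam_dates, key=lambda pid: exam_dates[pid])
    # Rounding rule is an assumption; at least one patient goes to the test set.
    n_test = max(1, round(len(ordered) * test_fraction))
    return ordered[:-n_test], ordered[-n_test:]


def images_for(patients: List[str], images_by_patient: Dict[str, List[str]]) -> List[str]:
    """Collect all images belonging to the given patients."""
    return [img for pid in patients for img in images_by_patient.get(pid, [])]
```

Splitting at the patient level keeps all images of one lesion on the same side of the split, so that near-duplicate frames of a single lesion cannot appear in both training and test data.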
To provide a comparison for the classification performance of the CNN-CAD system, four bronchoscopists were tasked with evaluating the test sets. Among them, two bronchoscopists were classified as Expert 1 (bronchoscopy specialist, 35 years of experience in bronchoscopy, and research experience related to EBUS imaging) and Expert 2 (bronchoscopy specialist, 7 years of experience in bronchoscopy, and research experience related to EBUS imaging). The others were classified as Trainee 1 (bronchoscopy specialist, 12 years of experience in bronchoscopy) and Trainee 2 (5 years of experience in bronchoscopy) after being trained in interpreting EBUS images. The bronchoscopists received the original EBUS images without information on the CNN-CAD system classification results and provided their own classifications (benign or malignant).
Visualization of the CNN-CAD system. CNN-based models provide excellent performance, but they lack intuitive components and are difficult to interpret. To better understand the prediction process of the deep learning model, we used gradient-weighted class activation mapping, a class-discriminative localization technique 9 .
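The core combination step of gradient-weighted class activation mapping can be sketched in plain Python. This is a minimal illustration under stated assumptions: the feature maps and gradients are passed in as nested lists, whereas in practice they would come from the last convolutional layer of a trained network through a deep learning framework, and the function name is hypothetical. The authors' actual implementation is not described in the text.

```python
from typing import List

Map2D = List[List[float]]


def grad_cam(feature_maps: List[Map2D], gradients: List[Map2D]) -> Map2D:
    """Combine feature maps into a class activation heatmap.

    feature_maps[k] is the k-th channel of a convolutional layer;
    gradients[k] is the gradient of the class score with respect to it.
    """
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    # Channel weight = global average of that channel's gradient.
    weights = [sum(map(sum, g)) / (h * w) for g in gradients]
    heatmap = [[0.0] * w for _ in range(h)]
    for a_k, w_k in zip(feature_maps, weights):
        for i in range(h):
            for j in range(w):
                heatmap[i][j] += w_k * a_k[i][j]
    # ReLU keeps only the regions that increase the class score.
    return [[max(0.0, v) for v in row] for row in heatmap]
```

The resulting map has the spatial size of the chosen layer; it is typically upsampled to the input size and overlaid on the image as a colour map.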
Statistical analysis. Statistical analyses were performed using R (version 3.6.2, R Foundation for Statistical Computing, Vienna, Austria). Quantitative variables were reported as mean and standard deviation, and qualitative variables were reported as frequency and percentage. Categorical data were analyzed using Fisher's exact test. Accuracy, sensitivity, specificity, PPV, and NPV between the bronchoscopists and the CNN-CAD system were expressed as percentages. The accuracy was compared using the McNemar test. Statistical significance was defined as a P-value < 0.05.
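For reference, the five evaluation indexes and a McNemar comparison can be computed from simple counts, as in the Python sketch below. It is only an illustration: the analyses in this study were performed in R, the text does not state whether an exact or a chi-squared McNemar test was used (the exact binomial form is assumed here), and the function names are hypothetical.

```python
from math import comb
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Evaluation indexes from a 2x2 table, with malignant as the positive class."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value.

    b = cases that reader A classified correctly and reader B incorrectly;
    c = the reverse. Concordant cases do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With hypothetical counts of 90 true positives, 20 false positives, 30 true negatives, and 10 false negatives (not study data), `diagnostic_metrics(90, 20, 30, 10)` gives an accuracy of 0.8, a sensitivity of 0.9, and a specificity of 0.6.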

Results
Clinicopathological and patient characteristics. After applying the inclusion and exclusion criteria, (Fig. 2). We then collated an independent test dataset of 42 peripheral pulmonary lesions that had been recorded from June 2019 to November 2019 (16 adenocarcinoma, 9 (Table 3). For each case, when the ratio of images estimated to be correct was 50% or more, the result was judged to be correct. Even if the positional relationship between the probe and the lesion was "adjacent to," malignancy could be diagnosed (Fig. 3, Table 3). As with the CNN-CAD system, for the four bronchoscopists, if more than 50% reached the correct decision, the result was judged to be correct. On comparison, the accuracy of the CNN-CAD system was found to be higher than that of the four bronchoscopists (p = 0.0433) (Table 4). Visualization was performed using gradient-weighted class activation mapping. The regions of interest identified by the CNN were visualized in red for malignant lesions and in blue for benign lesions (Fig. 4).
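The case-level rule stated above (a case is judged correct when at least 50% of its images are classified correctly) can be written as a short Python sketch. The function names and the example labels are hypothetical and serve only to illustrate the rule.

```python
from typing import List, Sequence, Tuple


def case_is_correct(image_predictions: Sequence[str], true_label: str) -> bool:
    """A case counts as correct when at least 50% of its images are right."""
    if not image_predictions:
        raise ValueError("a case needs at least one image prediction")
    n_correct = sum(pred == true_label for pred in image_predictions)
    return n_correct / len(image_predictions) >= 0.5


def case_level_accuracy(cases: List[Tuple[Sequence[str], str]]) -> float:
    """Fraction of cases judged correct; each case is (predictions, true label)."""
    return sum(case_is_correct(preds, label) for preds, label in cases) / len(cases)
```

Under this rule a case with exactly half of its images classified correctly is still counted as correct, which follows the "50% or more" wording used for the CNN-CAD system.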

Discussion
Our CNN-CAD system differentiated lung cancer from benign lung lesions with an accuracy of 83.4% in the independent test dataset. Furthermore, in a case-by-case analysis, the CNN-CAD system achieved a sensitivity of 100%. To the best of our knowledge, this is the first study to report the efficacy of a CNN-CAD system in distinguishing lung cancer from benign lesions on EBUS images.
The findings obtained by EBUS show the positional relationship between the ultrasonic probe and the lesion, and these positions are roughly divided into three patterns: "within" (lesion visualized all around the probe), "adjacent to" (probe visualized adjacent to the core lesion), and "invisible" (lesion not visualized at all). The diagnosis rate differs depending on this positional relationship, and if the tumor cannot be physically reached by the biopsy device, tissue cannot be collected and the diagnosis rate drops to approximately 60% or less 1-3 . This study included "adjacent to" cases in both the learning and testing datasets (Fig. 2). The accuracy of 83.4% leaves room for improvement, but the technique may be useful in cases where the EBUS probe is adjacent to the lesion.
Regarding the use of ultrasound images in bronchoscopy, it has been reported that convex probe endobronchial ultrasound sonographic images are useful. In a previous study 10 , a deep learning model was used to determine whether the mediastinal lymph nodes were benign or malignant. The accuracy was reported as 88.57%. Use of AI enables real-time diagnosis of a lesion, and if benign and malignant lesions can be distinguished based on the ultrasonic images, unnecessary biopsy can be avoided. We believe that diagnostic assistance using AI is useful not only for improving the accuracy of diagnosis but also for maintaining safety.
Methods for distinguishing benign and malignant lesions by using EBUS, which were based on the internal structure of the lesion, have been reported in the literature. The focus was on internal echo, bronchial and vascular patency, and morphology of the hyperechoic region 4 . In this study, visualization of the CNN-CAD system suggests that AI pays attention not only to the internal structure but also to the edges. AI may reflect differences that are undetectable to the human eye, such as echo attenuation.
One limitation of this study is that it was an observational study conducted in a single facility. However, virtual bronchoscopic navigation 11 or electromagnetic navigation bronchoscopy systems 12,13 have emerged as a means of supporting biopsy-based diagnosis of peripheral pulmonary lesions. The existing technical differences between the various facilities are being equalized by using an image-guided system. Another limitation of this study was the lack of data on benign diseases, which reduced the specificity of the obtained results. However, due to the nature of bronchoscopy itself, which deals mainly with malignancies, sensitivity takes precedence over specificity. A multicenter study that also collects data on benign diseases is therefore necessary. In the future, we aim to conduct studies to determine whether real-time evaluation of EBUS data during bronchoscopy can contribute to diagnostic accuracy.
Figure caption: Accuracy for each case. For each case, when the ratio of images estimated to be correct was 50% or more, it was judged to be correct. When the endobronchial ultrasound visualization was an "adjacent to" case, the graph showed a sprite pattern.
In conclusion, the use of the CNN-CAD system for diagnosing peripheral pulmonary lesions aids in the accurate diagnosis of lung cancer.

Data availability
The datasets generated and analyzed during the current study are not publicly available due to the waiver of the requirement of consent from patients, but are available from the corresponding author on reasonable request. The data provided will be de-identified, not raw data.